poetic form
so much depends / upon / a whitespace: Why Whitespace Matters for Poets and LLMs
Bhyravajjula, Sriharsh; Walsh, Melanie; Preus, Anna; Antoniak, Maria
Whitespace is a critical component of poetic form, reflecting both adherence to standardized forms and rebellion against those forms. Each poem's whitespace distribution reflects the artistic choices of the poet and is an integral semantic and spatial feature of the poem. Yet, despite the popularity of poetry as both a long-standing art form and as a generation task for large language models (LLMs), whitespace has not received sufficient attention from the NLP community. Using a corpus of 19k English-language published poems from Poetry Foundation, we investigate how 4k poets have used whitespace in their works. We release a subset of 2.8k public-domain poems with preserved formatting to facilitate further research in this area. We compare whitespace usage in the published poems to (1) 51k LLM-generated poems, and (2) 12k unpublished poems posted in an online community. We also explore whitespace usage across time periods, poetic forms, and data sources. Additionally, we find that different text processing methods can result in significantly different representations of whitespace in poetry data, motivating us to use these poems and whitespace patterns to discuss implications for the processing strategies used to assemble pretraining datasets for LLMs.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > Iowa (0.04)
- North America > United States > Indiana (0.04)
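To make the abstract's point about text processing concrete, here is a minimal sketch of how two common normalization steps change a poem's whitespace. The sample text and the helper names (`whitespace_profile`, `collapse_all`, `strip_lines`) are illustrative assumptions, not the authors' pipeline or their measurement scheme.

```python
import re

# Hypothetical sample text (not from the corpus) with indentation,
# a stanza break, and wide internal gaps.
SAMPLE = (
    "the river bends\n"
    "    and bends again\n"
    "\n"
    "stone      water      stone\n"
    "        then nothing"
)


def whitespace_profile(text: str) -> dict:
    """Count a few simple whitespace features of a poem."""
    lines = text.split("\n")
    return {
        "line_breaks": text.count("\n"),
        "blank_lines": sum(1 for ln in lines if ln.strip() == ""),
        # Non-empty lines that start with a space or tab.
        "indented_lines": sum(1 for ln in lines if ln.strip() and ln[0] in " \t"),
        # Runs of two or more spaces between non-space characters.
        "internal_gaps": sum(
            len(re.findall(r"(?<=\S) {2,}(?=\S)", ln)) for ln in lines
        ),
    }


def collapse_all(text: str) -> str:
    """Aggressive normalization: every whitespace run becomes one space."""
    return " ".join(text.split())


def strip_lines(text: str) -> str:
    """Milder normalization: keep line breaks, drop leading/trailing spaces."""
    return "\n".join(ln.strip() for ln in text.split("\n"))


if __name__ == "__main__":
    for name, version in [
        ("original", SAMPLE),
        ("strip_lines", strip_lines(SAMPLE)),
        ("collapse_all", collapse_all(SAMPLE)),
    ]:
        print(name, whitespace_profile(version))
```

In this sketch, stripping each line keeps the stanza break and the internal gaps but loses all indentation, while collapsing every whitespace run removes the lineation entirely. Real extraction pipelines differ in many more ways (HTML handling, tabs, non-breaking spaces), so this is only a toy comparison.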
Sonnet or Not, Bot? Poetry Evaluation for Large Models and Datasets
Walsh, Melanie; Preus, Anna; Antoniak, Maria
Large language models (LLMs) can now generate and recognize text in a wide range of styles and genres, including highly specialized, creative genres like poetry. But what do LLMs really know about poetry? What can they know about poetry? We develop a task to evaluate how well LLMs recognize a specific aspect of poetry, poetic form, for more than 20 forms and formal elements in the English language. Poetic form captures many different poetic features, including rhyme scheme, meter, and word or line repetition. We use this task to reflect on LLMs' current poetic capabilities, as well as the challenges and pitfalls of creating NLP benchmarks for poetry and for other creative tasks. In particular, we use this task to audit and reflect on the poems included in popular pretraining datasets. Our findings have implications for NLP researchers interested in model evaluation, digital humanities and cultural analytics scholars, and cultural heritage professionals.
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)
- North America > United States > Ohio (0.04)
World Poetry Day: Can you tell which poem was written by ChatGPT?
Every year on March 21, UNESCO celebrates World Poetry Day. Adopted in 1999, the occasion honours poets and celebrates linguistic diversity and the sharing of oral traditions through poetic forms. Poetry is believed to have originated thousands of years ago and has been kept alive in both oral and written form. Numerous poetic forms exist around the world, and their structures differ across cultures. Below are just some of the most common forms of poetry.